Abstract. We discuss the use of galaxies to trace the large-scale structure
of the universe and thereby to make cosmological inferences. We put special
emphasis on our lack of knowledge about the relative distribution of galaxies
and the dynamically important dark matter. We end with a discussion of the
increasing importance of infrared astronomy to large-scale structure studies.



1. Introduction 

Modern observational cosmology is done within a standard paradigm that has 
been in development since the invention of the concept of the expanding universe. 
It is worth reminding ourselves what the basic tenets of the paradigm are. 

• We live in a uniformly expanding universe, which originated in a Hot 
Big Bang. The evidence for the uniform expansion is overwhelming: in- 
dependent measurements of redshift and distance show that the two are 
proportional to impressive accuracy. This has been most spectacularly 
demonstrated recently with observations of high-redshift supernovae (e.g.,
Schmidt et al. 1998), which are found to obey the Hubble Law from
$z = 0.003$ to $z \approx 1.0$.^1

• The universe is homogeneous and isotropic on large scales, and thus is
described in General Relativity by the Friedmann-Robertson-Walker (FRW)
metric. This is a statement of the Cosmological Principle, and is
dramatically demonstrated by the isotropy of the Cosmic Microwave Background
(CMB) to one part in $10^5$.

• The large-scale distribution of matter grew via gravitational instability 
from initially low-amplitude fluctuations. The CMB indeed does show de- 
viations from isotropy, which are interpreted to be due to tiny fluctuations 
in the initial density field. Gravity amplifies the contrast between over- 
densities and underdensities, eventually leading to the structures that we 
see today. 



^1 Of course, the supernovae are showing evidence of deviations from the linear relation, which is
interpreted to be due to the curvature of the Universe, but that is a different story.
• Dark matter dominates the mass density of the universe. There is strong
evidence that most of the mass density of the universe is in a form that is
not directly visible, manifesting itself only through its gravitational
influence, and that the baryonic material makes up of order 10% or less of the
mass density of the universe.

This paradigm leaves many important questions unanswered. 

• The FRW metric is described by a number of parameters (not all
independent), which we might hope to measure: $\Omega$, the mass density of the
universe relative to the critical density; $H_0$, the Hubble Constant;
$\Lambda$, the contribution to the curvature by vacuum energy; $q_0$, the
deceleration parameter; and $t_0$, the current age of the universe. The
quantity $\Omega$ can be further divided into the contributions from different
mass constituents: baryons, and hot and cold dark matter. Our model will not
be complete until we have estimates of these parameters, and show that they
are mutually consistent within the model.

• We must characterize the initial perturbations from which the present- 
day large-scale structure grew. To the extent that the Fourier modes of 
the initial perturbations had random phases, as expected in inflationary
models, the power spectrum gives a complete statistical description of the
perturbations. If the fluctuations were seeded by discrete structures, as 
one expects in models involving phase transitions in the early universe, 
then the phases are correlated, and one needs higher-order statistics to 
describe the density field. 

• The physical nature of the dark matter remains unknown. We have good 
indirect evidence that most of it is non-baryonic, and because it is dark, we 
know that it does not interact strongly with photons. Its detailed nature 
will affect the shape of the power spectrum of fluctuations, so measure- 
ments of the power spectrum can shed light on the nature of dark matter. 

• Dark matter may dominate the mass density of the universe, but it is 
galaxies that we see. A complete model must describe how and when they 
formed, and how their formation is tied to the distribution of dark matter. 
Our theories can most easily predict the statistics of the distribution of 
dark matter; the relative distribution of galaxies and dark matter remains 
an important unknown in interpreting the observed large-scale structure 
of galaxies. 

In this paper, we do not attempt to answer all these rather broad and im- 
portant questions; rather, we discuss how to get a handle on all of them by 
measuring the large-scale distribution and motions of galaxies, with special em- 
phasis on the last problem mentioned above: the relative distribution of galaxies 
and dark matter. We will end with a brief description of the impact of infrared 
astronomy on the study of large-scale structure. 

2. Large-Scale Structure Data



Imperfect though they may be, galaxies are the tracers that we use to learn 
about the distribution of mass on large scales. 

The NGC was the first substantial catalog of galaxies compiled; it was 
apparent even from maps of these few thousand objects on the sky that they 
are not distributed uniformly. In particular, the plane of what we now call the
Local Supercluster is quite apparent in the NGC, with a particular concentration 
in the Virgo Cluster. Although we can learn a great deal from the projected 
distribution of galaxies on the sky, including the power spectrum and other 
clustering statistics, we need to know the distance of the objects in a galaxy 
catalog to quantify all the properties of the large-scale distribution of galaxies. 
The Hubble Law, which relates the redshift of a galaxy to its distance, allows us 
to tease out the all-important third dimension of the galaxy distribution. Since 
the first substantial redshift surveys for the study of large-scale structure in the 
late 1970s, surveys have grown in size and completeness; currently, the largest 
single redshift survey is that of Shectman et al. (1996), which contains roughly
25,000 galaxies. 

Even at low redshift, where cosmological curvature effects are unimpor- 
tant, the Hubble Law holds exactly only in the case of a perfectly homogeneous 
universe. The over- and under-densities we are concerned with here exert a 
gravitational pull on the adjacent galaxies. This gives galaxies a peculiar ve- 
locity, above and beyond the pure Hubble flow. The radial component of these 
velocities enters into the redshift: 

$$cz = H_0 r + \hat{\mathbf{r}} \cdot [\mathbf{v}(\mathbf{r}) - \mathbf{v}(\mathbf{0})]. \qquad (1)$$

One can thus estimate the radial component of these peculiar velocities
using independent measurements of the redshifts, $cz$, and distances, $H_0 r$,
of galaxies. Of course, the measurement of distances is inherently difficult, especially for
extragalactic objects; the standard candles one uses (such as the Tully-Fisher 
relation, which allows one to estimate the absolute magnitude of a spiral galaxy, 
given its rotation speed) have substantial scatter. Nevertheless, peculiar velocity 
studies have been done for large samples of galaxies, which are starting to yield 
a picture of the large-scale velocity field. 
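
As a minimal sketch of how Equation (1) is used in practice, the snippet below
subtracts the Hubble-flow term from an observed redshift to obtain the radial
peculiar velocity relative to the observer. The function names, the value
$H_0 = 70$ km/s/Mpc, the 20% distance scatter, and the example galaxy are
illustrative assumptions, not values taken from any survey; the error estimate
assumes that the distance uncertainty dominates.

```python
def radial_peculiar_velocity(cz_kms, distance_mpc, h0=70.0):
    """Radial peculiar velocity u = cz - H0 * r, in km/s (Equation 1).

    h0 is an assumed Hubble constant in km/s/Mpc. The result is the
    radial component of v(r) - v(0), i.e. relative to the observer.
    """
    return cz_kms - h0 * distance_mpc


def peculiar_velocity_error(distance_mpc, frac_distance_error, h0=70.0):
    """Velocity error (km/s) when the distance uncertainty dominates."""
    return h0 * distance_mpc * frac_distance_error


# Hypothetical galaxy: cz = 3000 km/s, Tully-Fisher distance of 40 Mpc.
u = radial_peculiar_velocity(3000.0, 40.0)
sigma_u = peculiar_velocity_error(40.0, 0.20)  # assumed 20% distance scatter
print(f"u = {u:.0f} km/s +/- {sigma_u:.0f} km/s")
```

With these made-up numbers the uncertainty on a single galaxy exceeds the
velocity itself, which is why peculiar velocity studies rely on large samples.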

With these redshift and peculiar velocity data, one can carry out a number 
of studies: 

• We can ask for the large-scale topology of the galaxy distribution. Even 
if the initial density field satisfied the random-phase hypothesis, nonlinear 
gravitational growth will cause coherent structures to form. People have 
described the galaxy distribution as being in sheets, filaments, the walls of 
bubbles, and so on. Interestingly, there still is not a satisfying statistical 
method of describing the features that are so apparent to our pattern- 
seeking eyes. 

• We can calculate the power spectrum of the galaxy distribution. This
has been distorted in various complicated ways from the initial power
spectrum of the early universe due to nonlinear growth, the effects of
peculiar velocities, and so on, but still has much to teach us about the
initial fluctuations, and by inference, the nature of dark matter, and the
values of cosmological parameters.



• We can calculate higher-order statistics of the clustering as well. As we 
have mentioned several times above, the nonlinear growth of clustering 
can take an initially random-phase distribution, and generate coherent 
structures, for which the power spectrum does not give the entire statistical 
description of the density field. One can calculate the growth of the higher- 
order clustering measures with time using perturbation theory, which can 
then be checked against observations; to the extent they agree, one has a 
consistency check on the hypotheses of an initially random-phase field and 
the growth of structure via gravitational instability. 

• We can relate the large-scale distribution of galaxies to the peculiar
velocities. We believe that it is the gravitational influence of the density
inhomogeneities in the galaxy distribution that causes the peculiar
velocities. In fact, to linear order in the fluctuations, gravitational
instability predicts a simple relationship between the velocity field
$\mathbf{v}$ and the mass density field $\delta$ (with distances expressed in
velocity units, so that no factor of $H_0$ appears):

$$\nabla \cdot \mathbf{v}(\mathbf{r}) = -\Omega^{0.6}\, \delta(\mathbf{r}). \qquad (2)$$
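
To make Equation (2) concrete, consider the textbook special case of a uniform
spherical perturbation: inside the sphere the linear-theory flow is radial,
$u(r) = -\frac{1}{3} H_0 \Omega^{0.6} \delta\, r$, and its divergence is
$-H_0 \Omega^{0.6} \delta$, which is Equation (2) with distances in Mpc rather
than velocity units. The sketch below checks this numerically. The function
names and all numerical values are illustrative assumptions, and the result
only applies in the linear regime, $|\delta| \ll 1$.

```python
def growth_rate(omega):
    """Approximate linear growth rate f ~ Omega^0.6 used in Equation (2)."""
    return omega ** 0.6


def tophat_infall_velocity(r_mpc, delta, omega, h0=70.0):
    """Linear-theory radial velocity (km/s) at radius r inside a uniform
    spherical perturbation of density contrast delta.

    Negative values mean infall toward the centre. h0 (km/s/Mpc) is an
    assumed value.
    """
    return -(1.0 / 3.0) * h0 * growth_rate(omega) * delta * r_mpc


def divergence_spherical(velocity, r_mpc, dr=1e-3):
    """Numerical divergence (1/r^2) d(r^2 v)/dr of a purely radial flow."""
    flux_out = (r_mpc + dr) ** 2 * velocity(r_mpc + dr)
    flux_in = (r_mpc - dr) ** 2 * velocity(r_mpc - dr)
    return (flux_out - flux_in) / (2.0 * dr * r_mpc ** 2)


# Hypothetical mildly overdense region: delta = 0.2 in an Omega = 0.3 universe.
omega, delta, h0 = 0.3, 0.2, 70.0


def v(r_mpc):
    return tophat_infall_velocity(r_mpc, delta, omega, h0)


div_v = divergence_spherical(v, 10.0)
expected = -h0 * growth_rate(omega) * delta
print(div_v, expected)
```

The two printed numbers agree to within the finite-difference error, and the
negative sign shows that overdense regions produce converging flows.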

The status of all these analyses has been reviewed extensively in the
literature (for example, Strauss & Willick 1995 and Strauss 1998a,b); here we want to
stress the limitations of existing data and analyses. First, for many of the anal- 
yses that one would like to do in large-scale structure, existing datasets survey 
too small a volume. An obvious example is the power spectrum on large scales. 
On sufficiently large scales, one expects the power spectrum to have the same 
shape as a function of wavenumber (albeit with a much greater amplitude) as 
the initial power spectrum; moreover, the features in the power spectrum that 
are diagnostics of the cosmological parameters and the nature of the dark mat- 
ter manifest themselves on large scales, above 50 Mpc or so. However, in 
order to measure the clustering signal on these scales, one needs a substantial 
number of independent volumes of this size. Existing redshift surveys simply do 
not cover a large enough volume to get a high signal-to-noise measurement of 
the shape of the power spectrum on these scales and larger. 
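
The volume argument can be made semi-quantitative with a back-of-the-envelope
count. The sketch below divides a survey volume into cubes of a given side and
takes the fractional sampling error on the clustering amplitude at that scale
to fall roughly as $1/\sqrt{N}$. This scaling, the simple survey geometry, and
all of the numbers are simplifying assumptions for illustration only; they do
not describe any particular survey.

```python
import math


def independent_volumes(survey_radius_mpc, sky_fraction, cell_size_mpc):
    """Rough number of independent cubes of side cell_size_mpc in a survey
    covering sky_fraction of the sky out to survey_radius_mpc."""
    survey_volume = sky_fraction * (4.0 / 3.0) * math.pi * survey_radius_mpc ** 3
    return survey_volume / cell_size_mpc ** 3


def rough_fractional_error(n_volumes):
    """Crude sampling-error estimate, ~ 1/sqrt(N)."""
    return 1.0 / math.sqrt(n_volumes)


# Hypothetical survey: 10% of the sky out to 300 Mpc, probed at two scales.
for scale_mpc in (50.0, 150.0):
    n = independent_volumes(300.0, 0.1, scale_mpc)
    print(f"{scale_mpc:.0f} Mpc: N ~ {n:.0f}, error ~ {rough_fractional_error(n):.2f}")
```

Because the count falls as the cube of the scale, tripling the scale of
interest cuts the number of independent volumes by a factor of 27.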

Second, these data are generically affected by systematic errors. Many of 
the state-of-the-art large-scale structure studies are based on galaxy catalogs 
selected by eye off of photographic plates in the 1970s and earlier. As the 
clustering on large scales is weak, it takes only small inhomogeneities in the 
selection of the catalog to swamp the cosmological signal one is looking for. 

New and larger redshift surveys, such as the Sloan Digital Sky Survey
(Gunn & Weinberg 1995; cf. http://www.astro.princeton.edu/BBOOK/) and
the Two-degree Field redshift survey (Colless 1998; cf.
http://mso.anu.edu.au/~colless/2dF/), will be able to address these
problems. But there is another problem, astrophysical in nature, which places a more
fundamental limitation on the interpretation of large-scale structure, namely our
lack of knowledge of the relative distribution of galaxies and dark matter.

3. How are Galaxies Distributed Relative to Dark Matter?



The physics of dark matter is relatively simple: being collisionless (as we infer, 
given its lack of interaction with baryons or photons), gravitational physics de- 
scribes its evolution and its large-scale distribution in the context of any specific 
model. However, it is galaxies which we observe directly, and it is not at all 
obvious a priori whether the large-scale structure we observe in the galaxy dis- 
tribution mirrors that in the underlying, and presumably dynamically dominant, 
dark matter. 

Let us define the density field $\rho(\mathbf{r})$ of either galaxies or dark
matter, smoothed on a scale $R$. The Cosmological Principle states that the
universe is homogeneous on large scales, so it makes sense to speak of a mean
density $\langle\rho\rangle$ of the universe; we therefore find it useful to
define the density fluctuation field relative to this mean density:

$$\delta(\mathbf{r}) = \frac{\rho(\mathbf{r}) - \langle\rho\rangle}{\langle\rho\rangle}. \qquad (3)$$
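
As a small worked example of the density fluctuation field defined above, the
sketch below converts counts in equal-volume cells into density contrasts. The
function name and the counts are hypothetical; real analyses must also correct
for the survey selection function, which is ignored here.

```python
def density_contrast(counts):
    """Return delta_i = (n_i - <n>) / <n> for a list of cell counts."""
    mean = sum(counts) / len(counts)
    if mean == 0:
        raise ValueError("mean density is zero; delta is undefined")
    return [(n - mean) / mean for n in counts]


# Hypothetical galaxy counts in five equal-volume cells of size R.
counts = [12, 3, 0, 25, 10]
delta = density_contrast(counts)
print(delta)
```

Two properties are visible directly: the contrasts average to zero by
construction, and an empty cell has a contrast of exactly -1, the lowest value
a physical density field can reach.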

The simplest relation one could imagine between the galaxy and mass dis- 
tribution is that they are the same: 

$$\delta_{\rm galaxies} = \delta_{\rm dark\ matter}. \qquad (4)$$

However, it was realized in the early 1980s that this assumption was quite
simplistic. In particular, if galaxies form preferentially in the regions of greatest
dark matter density, one generically expects the galaxies to be more clustered
than the dark matter (Kaiser 1984; Bardeen et al. 1986). This idea gave theorists
a new free parameter, the biasing parameter $b$, with which to fit their models;
it was particularly valuable, for example, in reconciling the $\Omega = 1$ CDM models
with observations (Davis et al. 1985). (In retrospect, the value of $b$ needed to fit
the CDM model is quite unrealistic, but that is another story.) In particular, in
the appropriate limit of Kaiser's original formulation of bias, one expects that
the galaxy and mass density fields should be proportional to one another:

$$\delta_{\rm galaxies} = b\, \delta_{\rm dark\ matter}. \qquad (5)$$

This linear bias model has been the default for most analyses of large-scale
structure. For example, people have used Equation (2) (e.g., Sigad et al. 1998)
or its integral equivalent (e.g., Willick et al. 1997) to compare redshift and
peculiar velocity data; assuming the linear bias relation, it can be phrased:

$$\nabla \cdot \mathbf{v}(\mathbf{r}) = -\beta\, \delta_{\rm galaxies}(\mathbf{r}). \qquad (6)$$

Thus, one ends up not measuring the really interesting quantity $\Omega$, but the
somewhat awkward combination $\beta \equiv \Omega^{0.6}/b$.
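
The degeneracy between the density parameter and the bias can be illustrated
with a short calculation. The sketch below inverts
$\beta = \Omega^{0.6}/b$ for $\Omega$ under several assumed bias values. The
function names, the value of $\beta$, and the bias values are hypothetical and
chosen only to show how strongly the inferred density depends on the assumed
bias.

```python
def beta(omega, bias):
    """beta = Omega^0.6 / b."""
    return omega ** 0.6 / bias


def omega_from_beta(beta_value, bias):
    """Invert beta = Omega^0.6 / b for Omega, given an assumed bias."""
    return (beta_value * bias) ** (1.0 / 0.6)


measured_beta = 0.5  # hypothetical measurement
for b in (0.7, 1.0, 1.5):
    print(f"b = {b}: Omega = {omega_from_beta(measured_beta, b):.2f}")
```

Because $\Omega$ scales as $(\beta b)^{1/0.6}$, a modest uncertainty in the
bias translates into a much larger uncertainty in the inferred density.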

The linear bias model is just a parameterization of our ignorance, and is
certainly over-simplistic, at least on small to moderate scales (say, less than 20
Mpc). It has been known for a long time that the bias parameter must be
a function of galaxy type. After all, one sees a preponderance of spiral galaxies
in the field, but they are completely absent in the cores of rich clusters (e.g.,
Dressler 1980). The large-scale distributions of galaxies of different types are
not identical. Moreover, the bias relationship must be nonlinear at some level;
the morphology-density relation in clusters shows that the density of spirals
is a non-monotonic function of the density of ellipticals; therefore they cannot
both be linear functions of the underlying dark matter! The bias relationship is
almost certainly not completely deterministic; the detailed physics that
determines whether a galaxy forms in a given place is not a function purely of the
present-day dark matter density, and thus the galaxy density field depends also
on additional physical quantities. The bias relationship must also be a function
of smoothing scale $R$; on small scales, the rms density fluctuations are large,
$\sigma \equiv \langle\delta^2\rangle^{1/2} \gg 1$, and there are regions in
which $\delta = -1$, devoid of matter. But if $b > 1$ in Equation (5) (as models
would suggest), one would have the unphysical situation of
$\delta_{\rm galaxies} < -1$. Finally, the bias relationship is a function of
redshift: as gravity pulls galaxies and dark matter alike into deep potential wells,
one expects $b$ to approach unity with time. Indeed, Adelberger et al. (1998; see
also Steidel, this conference) observe clustering at $z \approx 3$ which is as strong as
clustering today. Clustering of matter grows with time, so the clustering of the
underlying dark matter should have been appreciably weaker back then, implying that the
effective galaxy bias was quite a bit stronger at $z \approx 3$ than it is today.
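
The breakdown of linear bias in deep voids is easy to demonstrate numerically.
The sketch below applies Equation (5) to a handful of matter density contrasts
and flags those for which the predicted galaxy density would be negative. The
function names and the contrast values are hypothetical.

```python
def linear_bias(delta_matter, b):
    """Equation (5): delta_galaxies = b * delta_matter."""
    return b * delta_matter


def unphysical_cells(delta_matter_values, b):
    """Return the matter contrasts for which linear bias predicts a
    negative galaxy density (delta_galaxies < -1)."""
    return [d for d in delta_matter_values if linear_bias(d, b) < -1.0]


# Hypothetical smoothed matter contrasts, including a deep void.
deltas = [-0.9, -0.6, -0.2, 0.5, 3.0]
print(unphysical_cells(deltas, b=1.5))
```

With $b = 1.5$ the deepest void would need a galaxy contrast of -1.35, below
the physical floor of -1, whereas with $b = 1$ no cell is flagged.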

One way to get a handle on all these complications is to turn to cosmological 
simulations. In particular, Blanton et al. (1998) have examined the relative 
distribution of galaxies and dark matter in a simulation which models both the 
gravitational physics of dark matter and the gas physics of the baryons. The 
simulation handles star formation by converting gas into collisionless particles
in regions with infalling gas, with cooling times below the local dynamical time,
and masses above the Jeans mass. One can then look at the relationship between
the density field of these collisionless particles (which we take as a proxy for the
galaxy density field) and the dark matter density field. 

The resulting bias relationship is nonlinear, stochastic, and is a strong
function of galaxy age. These properties are revealed in Figure 1, which shows as a
greyscale the conditional probability $P(1 + \delta_g | 1 + \delta)$ and as the
solid line the conditional mean $\langle 1 + \delta_g | 1 + \delta \rangle$,
where all quantities are defined with a top hat filter of radius
1 $h^{-1}$ Mpc. Each panel shows the results at $z = 0$ for galaxies formed
at different epochs, as labeled. Note that the oldest galaxies are the cleanest
tracers of the dark matter distribution, in that the scatter around the mean
galaxy-dark matter density relation is small. However, the youngest galaxies
show a very nonlinear, even non-monotonic, relation with the dark matter; they
are underrepresented in the very densest regions of the dark matter map
(reminiscent, indeed, of spirals in the cores of clusters, although in the real universe
clusters still represent appreciable overdensities in the distribution of late-type
galaxies; Strauss et al. 1992a). In addition, the scatter around the mean density
relation for the youngest galaxies is quite sizeable.

[Figure 1: horizontal axis is Mass Density = $1 + \delta$.]

Figure 1. Galaxy mass density as a function of dark matter density
for each age quartile, at 1 $h^{-1}$ Mpc radius top hat smoothing. Each
panel lists the range of formation redshifts included. Shading is a
logarithmic stretch of the conditional probability
$P(1 + \delta_g | 1 + \delta)$. Solid lines indicate
$\langle 1 + \delta_g | 1 + \delta \rangle$; dotted lines indicate the
$1\sigma$ deviation from the mean.

In these simulations, the relationship between galaxies and mass also
depends on scale. In Figure 2, we show the bias $b = \sigma_g/\sigma$ calculated on various
scales. The obvious scale-dependence of $b$ is due to the dependence of the galaxy
formation process on temperature. The temperature sets the local Jeans mass,
which partly determines whether star formation occurs: the higher the
temperature, the greater the overdensity needed to form stars. On small scales
the temperature is proportional to the gravitational potential $\phi$. Note that in
Fourier space, $\phi(k) \propto \delta(k)/k^2$. For high $k$, then, there is little power in the
potential or temperature fields; i.e., these fields are smoother than the density field.
Thus, temperature correlates over large scales; furthermore, on these large scales
it correlates with density as well. Thus the dependence of galaxy formation on
temperature can couple the galaxy density on small scales with the dark matter
density on larger scales. As Blanton et al. (1998) show, this coupling causes
scale-dependence of the bias relation. The dependence of galaxy formation on
local gas temperature is likely to be important in any galaxy formation scenario;
thus, this scale-dependence may be generic.
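
The statement that the potential is smoother than the density field follows
from the Poisson equation. As a sketch, assuming the standard form of the
Poisson equation in comoving coordinates (with scale factor $a$ and mean
density $\bar{\rho}$, symbols not used elsewhere in this paper), the chain of
reasoning is:

```latex
\nabla^2 \phi(\mathbf{r}) = 4\pi G \bar{\rho}\, a^2\, \delta(\mathbf{r})
\;\;\Longrightarrow\;\;
-k^2 \phi(\mathbf{k}) = 4\pi G \bar{\rho}\, a^2\, \delta(\mathbf{k})
\;\;\Longrightarrow\;\;
P_\phi(k) \propto \frac{P_\delta(k)}{k^4}.
```

The factor of $k^{-4}$ suppresses the small-scale (high-$k$) power in the
potential relative to the density, so the potential, and with it the
temperature, varies coherently over much larger scales.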

The work ahead is to evaluate the consequences of this rather complicated bias
relationship for interpretations of statistical measures of large-scale structure. It
is known, for example, that stochasticity in the bias relation will systematically
affect analyses comparing peculiar velocity and density fields based on Equation
(2) (Dekel & Lahav 1998); it may turn out that this effect is large enough
to explain the discrepancy various workers are finding in analyses of existing
datasets in the inferred value of $\beta$.

[Figure 2: surviving curve labels are 1.1 > z > 0.6 and z < 0.6; horizontal axis is $R$ ($h^{-1}$ Mpc).]

Figure 2. The bias $b(R) = \sigma_g(R)/\sigma(R)$, where $R$ refers to the top
hat smoothing radius. Solid line indicates all galaxies. Dashed lines
indicate each age quartile, with range of formation redshifts listed.
Note the strong scale-dependence, and that old galaxies are more biased
than young.

4. The Impact of IR Astronomy on Large-Scale Structure Studies

What is the connection of all of this to the subject of this meeting? Infrared
astronomy has had a large impact on the study of large-scale structure, and
promises to have an increasing role, as we argue in these final remarks. In many
ways, the near-infrared is an optimal wavelength regime in which to study
galaxies for statistical and dynamical purposes. The problems with both foreground
and internal extinction are minimized, and the SEDs of the red stars that make
up the bulk of the stellar mass of galaxies peak in the near-IR. However, it has
only been relatively recently that infrared instrumentation has advanced to the
point that large-scale surveys are possible.

Thus it was realized quite early on that the near-IR was a particularly
good place to do galaxy photometry for the Tully-Fisher relation (Aaronson,
Huchra, & Mould 1979), but the first large-scale survey of galaxies on the sky
in the near-IR, the Two-Micron All Sky Survey (2MASS), is only being carried out now
(Skrutskie, this meeting). 2MASS will be invaluable for large-scale structure
studies for a number of reasons. Having two identical telescopes observing in the
two hemispheres with the same instrumentation, with data reduced identically,
will result in a uniform galaxy catalog free from the headaches of trying to match
disparate catalogs from different regions of the sky (cf. Santiago et al. 1996).
This, coupled with the small effect of the Zone of Avoidance in the K band,
means that one can study galaxy clustering on large angular scales effectively
with the 2MASS data. The galaxy catalogs generated from the IRAS database at
60 $\mu$m (cf. Fisher et al. 1995) share many of these virtues, and indeed have been
used for a large range of large-scale structure studies (as reviewed by Strauss
& Willick 1995), but 2MASS will be sensitive to the early-type galaxies that
are mostly absent in IRAS, and it will go appreciably deeper, with far better
sampling of the density field. In particular, the dipole moment of the galaxy
distribution can be compared with the peculiar velocity of the Local Group to
infer the depth at which large-scale flows converge; Strauss et al. (1992b) find
hints from the IRAS data of possible contributions to the flows on scales above
100 Mpc. 2MASS should be able to settle this question unambiguously. The
full analysis of large-scale structure with the 2MASS extragalactic database will
require redshifts; the survey suffers from an embarrassment of riches, with of
order $10^6$ galaxies. John Huchra is leading a group with the ambitious goal of
measuring redshifts for the brightest 250,000 2MASS galaxies in K; this should
be an absolutely spectacular dataset for the study of large-scale structure, which
will make the IRAS redshift surveys quite obsolete.

Most analyses of large-scale structure treat all galaxies as equal tracers of 
the density field. With our improved understanding of the messy problem of 
biasing, and in particular a realization that the bias of any given galaxy sample 
is a strong function of the way in which they were selected, we need to include 
galaxy properties explicitly in our analyses. Infrared astronomy is starting to 
give us a real understanding of where the bulk of the bolometric luminosity of 
galaxies is coming out; a real theme of this meeting was that SIRTF and other 
tools of infrared astronomy will finally give us a clear understanding of the full 
SEDs of galaxies of all different types. If we wish to have a real understanding 
of the distribution of galaxies in space, and in particular, its relation to the dark 
matter which dominates the dynamics, we need to have as unbiased (in both 
senses of the word!) a sample of galaxies as possible. We simply cannot learn 
how to do this properly until infrared astronomy gives us the tools to select 
galaxies and measure their simplest underlying physical properties. 